Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics

Authors

Pablo Samuel Castro

Doina Precup

Abstract

Temporally extended actions are usually effective in speeding up reinforcement learning. In this paper we present a mechanism for automatically constructing such actions, expressed as options [Sutton et al., 1999], in a finite Markov Decision Process (MDP). To do this, we compute a bisimulation metric [Ferns et al., 2004] between the states in a small MDP and the states in a large MDP, which we want to solve. The shape of this metric is then used to completely define a set of options for the large MDP. We demonstrate empirically that our approach is able to improve the speed of reinforcement learning, and is generally not sensitive to parameter tuning.
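
As a rough illustration of the first step described in the abstract, the sketch below computes a bisimulation-style distance between every state of a small MDP and every state of a large one by fixed-point iteration. It is not the authors' algorithm. It assumes deterministic transitions (so the Kantorovich term of the Ferns et al. metric reduces to the distance between successor states), a shared action set, and illustrative weights c_r and c_t. All function and variable names are hypothetical.

```python
def bisim_metric_deterministic(r1, n1, r2, n2, c_r=0.1, c_t=0.9,
                               tol=1e-10, max_iter=10_000):
    """Distance d[s][t] between state s of MDP 1 and state t of MDP 2.

    r1[s][a], r2[t][a]: rewards; n1[s][a], n2[t][a]: successor state indices.
    Assumption: transitions are deterministic and both MDPs share actions.
    """
    num_actions = len(r1[0])
    d = [[0.0] * len(r2) for _ in r1]
    for _ in range(max_iter):
        # One application of the update:
        # d(s, t) = max_a [ c_r * |r1 - r2| + c_t * d(next_s, next_t) ]
        new_d = [
            [
                max(
                    c_r * abs(r1[s][a] - r2[t][a])
                    + c_t * d[n1[s][a]][n2[t][a]]
                    for a in range(num_actions)
                )
                for t in range(len(r2))
            ]
            for s in range(len(r1))
        ]
        change = max(
            abs(new_d[s][t] - d[s][t])
            for s in range(len(r1))
            for t in range(len(r2))
        )
        d = new_d
        if change < tol:  # c_t < 1 makes the update a contraction
            break
    return d


# Toy data (made up for illustration): one action, reward 1 only in the
# final, absorbing state of each chain.
small_r, small_n = [[0.0], [1.0]], [[1], [1]]
large_r, large_n = [[0.0], [0.0], [1.0]], [[1], [2], [2]]

d = bisim_metric_deterministic(small_r, small_n, large_r, large_n)
```

In this toy case the rewarding states of the two chains end up at distance zero from each other, which is the sense in which the metric identifies behaviorally equivalent states across the two MDPs. Stochastic transitions would require solving an optimal-transport problem at each update, which this sketch deliberately avoids.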

Similar resources

Using bisimulation for policy transfer in MDPs

Knowledge transfer has been suggested as a useful approach for solving large Markov Decision Processes. The main idea is to compute a decision-making policy in one environment and use it in a different environment, provided the two are "close enough". In this paper, we use bisimulation-style metrics (Ferns et al., 2004) to guide knowledge transfer. We propose algorithms that decide what actions...

Basis refinement strategies for linear value function approximation in MDPs

We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs). We show that important existing methods, such as Krylov bases and Bellman-error-based methods, are special cases of the general framework we develop. We provide a general algorithmic framework for computing basis function refinements which “res...

Metrics for Markov Decision Processes with Infinite State Spaces

We present metrics for measuring state similarity in Markov decision processes (MDPs) with infinitely many states, including MDPs with continuous state spaces. Such metrics provide a stable quantitative analogue of the notion of bisimulation for MDPs, and are suitable for use in MDP approximation. We show that the optimal value function associated with a discounted infinite horizon planning tas...

Representation Discovery for MDPs Using Bisimulation Metrics

We provide a novel, flexible, iterative refinement algorithm to automatically construct an approximate state-space representation for Markov Decision Processes (MDPs). Our approach leverages bisimulation metrics, which have been used in prior work to generate features to represent the state space of MDPs. We address a drawback of this approach, which is the expensive computation of the bisimulat...

Algorithms for Game Metrics (Full Version) (arXiv:0809.4326v2 [cs.GT], 9 Oct 2008)

Simulation and bisimulation metrics for stochastic systems provide a quantitative generalization of the classical simulation and bisimulation relations. These metrics capture the similarity of states with respect to quantitative specifications written in the quantitative μ-calculus and related probabilistic logics. We present algorithms for computing the metrics on Markov decision processes (MD...

Publication date: 2011
